Textbooks
Required
This course makes use of several textbooks. I have attempted when possible to choose resources which are freely available.
- Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani and Jonathan Taylor. Introduction to Statistical Learning with Applications in Python
This book provides a strong practical introduction to machine learning. Its authors led the development of machine learning and the modernization of statistical theory to accommodate it. The book is excellent at teaching the intuition behind advanced methods without requiring you to already be an expert in advanced mathematics. The end of each chapter has a programming lab which shows you how to implement the concepts in the chapter.
- Alex Gold. DevOps for Data Science
This book is about the design considerations and software infrastructure required for deploying machine-learning-based products in production. It covers topics which are difficult to learn outside an organization. We will cover some of the tools in this book throughout the semester, and they will be important for your final project.
- One book on causal inference in a machine learning context. We are going to have one module on causal inference, which is not covered in Introduction to Statistical Learning. Several other resources will be available, each of which approaches this from a machine learning context:
Matheus Facure Alves. Causal Inference for The Brave and True. This is a good introduction to causal inference which focuses on situations where machine learning is insufficient to solve business problems.
Moritz Hardt and Benjamin Recht. Patterns, Predictions, and Actions: A story about machine learning. This is an excellent book which provides an alternative perspective on machine learning. It has two excellent chapters on causal inference. It is mostly a conceptual book, so it doesn’t discuss as wide a variety of models as Introduction to Statistical Learning, and it requires a lot more mathematical background. However, it could be quite useful to you.
Mutlu Yuksel and Yigit Aydede. Causal Inference and Machine Learning: In Economics, Social, and Health Sciences. This is a complete book on machine learning from a causal inference perspective.
Optional
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Elements of Statistical Learning
This is the PhD level textbook version of the course textbook. It goes into all of the details. Look here if you want to go deeper than the course.
- Yasser Abu-Mostafa, Malik Magdon-Ismail and Hsuan-Tien Lin. Learning from Data
This book is for you if you want a very clear introduction to the theory of machine learning that is not too mathematically demanding. It will teach you challenging theoretical concepts like the VC dimension in the context of very simple models. The book is therefore very conceptual: only at the very end does it teach you any practical algorithm you would use to solve a real-world problem. It is, however, great at teaching you what is really happening when you try to train an algorithm.
- Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and PyTorch
This is a well-regarded hands-on book that teaches machine learning with a focus on the libraries we are using in this class (scikit-learn and PyTorch, the latter being a new addition).
- Kevin Murphy. Probabilistic Machine Learning.
Extremely comprehensive book with a no-nonsense style that some will appreciate. An alternative to Elements of Statistical Learning if you find it too wordy. Very advanced.
- Andriy Burkov. The 100 Page Machine Learning Book
If you need something concise.